32 research outputs found

    2.5K-Graphs: from Sampling to Generation

    Understanding network structure and having access to realistic graphs plays a central role in computer and social networks research. In this paper, we propose a complete and practical methodology for generating graphs that resemble a real graph of interest. The metrics of the original topology we target to match are the joint degree distribution (JDD) and the degree-dependent average clustering coefficient ($\bar{c}(k)$). We start by developing efficient estimators for these two metrics based on a node sample collected via either independence sampling or random walks. Then, we process the output of the estimators to ensure that the target properties are realizable. Finally, we propose an efficient algorithm for generating topologies that have the exact target JDD and a $\bar{c}(k)$ close to the target. Extensive simulations using real-life graphs show that the graphs generated by our methodology are similar to the original graph with respect to not only the two target metrics, but also a wide range of other topological metrics; furthermore, our generator is orders of magnitude faster than state-of-the-art techniques.
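    As a point of reference for the two target metrics, the sketch below computes the JDD and $\bar{c}(k)$ exactly on a small synthetic networkx graph; it is not the paper's sampling-based estimator or generator, and the function names and example graph are placeholders.

```python
# Minimal sketch: compute the two target metrics of 2.5K-Graphs exactly on a
# known graph. JDD counts edges between degree-k and degree-l nodes; c_bar(k)
# averages the local clustering coefficient over nodes of degree k.
from collections import Counter, defaultdict
import networkx as nx

def joint_degree_distribution(G):
    """Return JDD[(k, l)] = number of edges whose endpoints have degrees k and l."""
    jdd = Counter()
    for u, v in G.edges():
        k, l = sorted((G.degree(u), G.degree(v)))
        jdd[(k, l)] += 1
    return jdd

def degree_dependent_clustering(G):
    """Return c_bar[k] = average local clustering coefficient of degree-k nodes."""
    local = nx.clustering(G)              # per-node clustering coefficients
    by_degree = defaultdict(list)
    for node, c in local.items():
        by_degree[G.degree(node)].append(c)
    return {k: sum(cs) / len(cs) for k, cs in by_degree.items()}

if __name__ == "__main__":
    G = nx.barabasi_albert_graph(1000, 3, seed=1)   # stand-in for a real topology
    print(list(joint_degree_distribution(G).items())[:5])
    print(sorted(degree_dependent_clustering(G).items())[:5])
```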

    A Network Coding Approach to Loss Tomography

    Network tomography aims at inferring internal network characteristics based on measurements at the edge of the network. In loss tomography, in particular, the characteristic of interest is the loss rate of individual links, and multicast and/or unicast end-to-end probes are typically used. Independently, recent advances in network coding have shown that there are advantages from allowing intermediate nodes to process and combine, in addition to just forward, packets. In this paper, we study the problem of loss tomography in networks with network coding capabilities. We design a framework for estimating link loss rates, which leverages network coding capabilities, and we show that it improves several aspects of tomography including the identifiability of links, the trade-off between estimation accuracy and bandwidth efficiency, and the complexity of probe path selection. We discuss the cases of inferring link loss rates in a tree topology and in a general topology. In the latter case, the benefits of our approach are even more pronounced compared to standard techniques, but we also face novel challenges, such as dealing with cycles and multiple paths between sources and receivers. Overall, this work makes the connection between active network tomography and network coding.
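    To illustrate the kind of identifiability gain coded probes can provide (a minimal sketch under simplified assumptions, not the paper's framework), consider the smallest coded topology: two sources send probes x1 and x2 over links 1 and 2 to an intermediate node, which XOR-combines whatever arrived and forwards it over link 3 to a receiver. Observing which combination arrives makes all three link success rates solvable in closed form; the loss rates, probe counts, and function names below are illustrative.

```python
# Simulate probes on the two-source inverted tree and recover per-link success
# probabilities from the observed outcome frequencies.
import random

def run_probes(a1, a2, a3, n, rng):
    """Simulate n probe slots; count which payload reaches the receiver."""
    counts = {"x1": 0, "x2": 0, "x1^x2": 0, "none": 0}
    for _ in range(n):
        got1 = rng.random() < a1                     # x1 survives link 1
        got2 = rng.random() < a2                     # x2 survives link 2
        if (got1 or got2) and rng.random() < a3:     # coded packet survives link 3
            counts["x1^x2" if (got1 and got2) else ("x1" if got1 else "x2")] += 1
        else:
            counts["none"] += 1
    return counts

def estimate_links(counts, n):
    """Closed-form estimators, e.g. P(x1^x2) / [P(x1^x2) + P(x1)] = a2."""
    p12, p1, p2 = counts["x1^x2"] / n, counts["x1"] / n, counts["x2"] / n
    a2 = p12 / (p12 + p1)
    a1 = p12 / (p12 + p2)
    a3 = p12 / (a1 * a2)
    return a1, a2, a3

if __name__ == "__main__":
    n = 200_000
    counts = run_probes(a1=0.9, a2=0.8, a3=0.95, n=n, rng=random.Random(0))
    print(estimate_links(counts, n))   # should be close to (0.9, 0.8, 0.95)
```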

    Walking in Facebook: A Case Study of Unbiased Sampling of OSNs

    With more than 250 million active users [1], Facebook (FB) is currently one of the most important online social networks. Our goal in this paper is to obtain a representative (unbiased) sample of Facebook users by crawling its social graph. In this quest, we consider and implement several candidate techniques. Two approaches that are found to perform well are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the “ground truth” (UNI, obtained through true uniform sampling of FB userIDs). In contrast, the traditional Breadth-First-Search and Random Walk (without re-weighting) perform quite poorly, producing substantially biased results. In addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show how these can be used to effectively determine when a random walk sample is of adequate size and quality for subsequent use (i.e., when it is safe to cease sampling). Using these methods, we collect the first, to the best of our knowledge, unbiased sample of Facebook. Finally, we use one of our representative datasets, collected through MHRW, to characterize several key properties of Facebook. Index Terms—Measurements, online social networks, Facebook, graph sampling, crawling, bias.
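    The core of the MHRW approach is a standard Metropolis-Hastings acceptance rule that turns a degree-biased neighbor walk into one with a uniform stationary distribution. A minimal sketch is shown below, run on a synthetic networkx graph rather than a Facebook crawl; the burn-in and walk length are illustrative, not the paper's settings.

```python
# Metropolis-Hastings random walk: accept a move u -> v with probability
# min(1, deg(u)/deg(v)); rejected moves repeat the current node (self-loop).
import random
import networkx as nx

def mhrw_sample(G, start, length, burn_in=1000, rng=random.Random(0)):
    """Return a list of visited nodes whose stationary distribution is uniform."""
    u, sample = start, []
    for step in range(burn_in + length):
        v = rng.choice(list(G.neighbors(u)))
        if rng.random() <= G.degree(u) / G.degree(v):
            u = v                              # accept; otherwise stay at u
        if step >= burn_in:
            sample.append(u)
    return sample

if __name__ == "__main__":
    G = nx.barabasi_albert_graph(5000, 3, seed=1)
    sample = mhrw_sample(G, start=0, length=20_000)
    # The sample's mean degree should approach the true mean degree, whereas an
    # unadjusted random walk would over-represent high-degree nodes.
    print(sum(G.degree(u) for u in sample) / len(sample))
    print(2 * G.number_of_edges() / G.number_of_nodes())
```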

    Practical recommendations on crawling online social networks

    Our goal in this paper is to develop a practical framework for obtaining a uniform sample of users in an online social network (OSN) by crawling its social graph. Such a sample allows us to estimate any user property and some topological properties as well. To this end, first, we consider and compare several candidate crawling techniques. Two approaches that can produce approximately uniform samples are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the “ground truth”. In contrast, using Breadth-First-Search (BFS) or an unadjusted Random Walk (RW) leads to substantially biased results. Second, in addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show how these diagnostics can be used to effectively determine when a random walk sample is of adequate size and quality. Third, as a case study, we apply the above methods to Facebook and we collect the first, to the best of our knowledge, representative sample of Facebook users. We make it publicly available and employ it to characterize several key properties of Facebook.
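    The RWRW alternative keeps the simple random walk, which visits nodes roughly proportionally to their degree, and corrects the bias afterwards with importance weights of 1/degree. The sketch below shows this Hansen-Hurwitz-style re-weighting on a synthetic graph; the graph, walk length, and function names are illustrative, not the paper's crawler.

```python
# Re-weighted random walk (RWRW): estimate a population mean from a degree-biased
# walk by weighting each visited node with 1/degree.
import random
import networkx as nx

def random_walk(G, start, length, rng=random.Random(0)):
    """Simple random walk: at each step move to a uniformly chosen neighbor."""
    u, walk = start, []
    for _ in range(length):
        u = rng.choice(list(G.neighbors(u)))
        walk.append(u)
    return walk

def rwrw_estimate(G, walk, f):
    """Estimate the mean of f over all nodes, correcting the walk's degree bias."""
    weights = [1.0 / G.degree(u) for u in walk]
    return sum(w * f(u) for w, u in zip(weights, walk)) / sum(weights)

if __name__ == "__main__":
    G = nx.barabasi_albert_graph(5000, 3, seed=1)
    walk = random_walk(G, start=0, length=50_000)
    naive = sum(G.degree(u) for u in walk) / len(walk)      # degree-biased average
    corrected = rwrw_estimate(G, walk, f=G.degree)          # re-weighted average
    true_mean = 2 * G.number_of_edges() / G.number_of_nodes()
    print(naive, corrected, true_mean)
```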

    A Network Coding Approach to Network Tomography

    Network tomography aims at inferring internal network characteristics based on measurements at the edge of the network. In loss tomography, in particular, the characteristic of interest is the loss rate of individual links. There is a significant body of work dedicated to this problem using multicast and/or unicast end-to-end probes. Independently, recent advances in network coding have shown that there are several advantages from allowing intermediate nodes to process and combine, in addition to just forward, packets. In this paper, we revisit the problem of loss tomography in networks that have network coding capabilities. We design a novel framework for estimating link loss rates, which leverages network coding capabilities to improve several aspects of the tomography problem, including the identifiability of links, the tradeoff between accuracy of estimation and bandwidth efficiency, and the complexity of probe path selection. We present first the case of tree topologies and then the case of general graphs. In the latter case, the benefits of our approach are even more pronounced compared to standard techniques, but we also face novel challenges, such as dealing with cycles and multiple paths between sources and receivers.

    A network coding approach to IP traceback

    Traceback schemes aim at identifying the source(s) of a sequence of packets and the nodes these packets traversed. This is useful for tracing the sources of high-volume traffic, e.g., in Distributed Denial-of-Service (DDoS) attacks. In this paper, we are particularly interested in Probabilistic Packet Marking (PPM) schemes, where intermediate nodes probabilistically mark packets with information about their identity and the receiver uses information from several packets to reconstruct the paths they have traversed. Our work is inspired by two observations. First, PPM is essentially a coupon collector’s problem [1], [2]. Second, the coupon collector’s problem significantly benefits from network coding ideas [3], [4]. Based on these observations, we propose a network coding-based approach (PPM+NC) that marks packets with random linear combinations of router IDs, instead of individual router IDs. We demonstrate its benefits through analysis. We then propose a practical PPM+NC scheme based on the main PPM+NC idea, but also taking into account the limited bit budget in the IP header available for marking and other practical constraints. Simulation results show that our scheme significantly reduces the number of packets needed to reconstruct the attack graph, in both single- and multi-path scenarios, thus increasing the speed of tracing the attack back to its source(s).
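    The coupon-collector intuition can be checked with a small simulation, sketched below under strong simplifications: plain PPM is modeled as each packet revealing one uniformly chosen router ID, and the coded variant as each packet carrying a random GF(2) combination of the n router IDs, collected until the combinations reach full rank. This is not the paper's header encoding; the path length and marking model are illustrative.

```python
# Compare packets needed to reconstruct an n-router path: individual IDs
# (coupon collector, ~ n * H_n) vs. random GF(2) combinations (only slightly
# more than n independent combinations are needed).
import random

def packets_plain_ppm(n, rng):
    """Packets until every one of the n router IDs has been seen at least once."""
    seen, count = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))       # simplified: each packet reveals one uniform ID
        count += 1
    return count

def packets_coded_ppm(n, rng):
    """Packets until the received GF(2) combinations of router IDs have full rank."""
    pivots, count = {}, 0                # leading-bit position -> reduced combination
    while len(pivots) < n:
        v = rng.getrandbits(n)           # random combination of the n router IDs
        count += 1
        while v:
            top = v.bit_length() - 1     # position of the leading 1 bit
            if top not in pivots:
                pivots[top] = v          # new independent combination found
                break
            v ^= pivots[top]             # reduce against the basis and continue
    return count

if __name__ == "__main__":
    rng, n, trials = random.Random(0), 32, 200
    print(sum(packets_plain_ppm(n, rng) for _ in range(trials)) / trials)  # roughly n * ln(n)
    print(sum(packets_coded_ppm(n, rng) for _ in range(trials)) / trials)  # close to n
```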